Journal: bioRxiv
Article Title: Expanding the DNA Motif Lexicon of the Transcriptional Regulatory Code
doi: 10.1101/2025.07.09.662874
Figure Legend Snippet: (A) Barcode count distribution. Histograms showing the distribution of unique barcode counts recovered from RNA for TF motif library sequences after transfection in K562 (left), GM12878 (middle), and Jurkat (right) cells. Sequences with ≥30 unique barcodes (dashed line) were used for downstream analyses. (B) Reproducibility of the biological replicates. Heatmap displaying pairwise Pearson correlation coefficients for aggregated barcode counts linked to a given TF motif sequence across RNA and DNA biological replicates. (C) Schematic illustrating the determination of simple and composite motif activities from the MPRA datasets. For simple motifs, motif activity was calculated using DESeq2 by comparing the RNA/DNA log2 fold change (log2FC) of motif-containing sequences (sequence activity), used as test values, with the log2FC of matched mutated control sequences, used as reference values. For composite motifs, activities were assessed similarly by comparing the RNA/DNA log2FC of CE motif-containing sequences with that of matched double-mutant controls. To determine individual motif contributions within a CE, the RNA/DNA log2FC values of sequences carrying single-motif mutations were analyzed, with those from matched double-mutant sequences used as controls. (D) Numbers of functional simple and composite TF motifs revealed by MPRA. Table summarizing the counts of functional (activating or repressing) simple and composite motifs identified in K562, GM12878, and Jurkat cells, along with their median sequence and motif activities. (E) Comparisons of cCE activities across cell types. Scatter plots of cCE motif activity (log2FC) measured in GM12878 vs. K562 (left), GM12878 vs. Jurkat (middle), and Jurkat vs. K562 (right) cells. The green and blue dots indicate cell type-specific activating or repressing cCEs in the indicated cell types. The red dots are cCEs that are activating or repressing in both cell types. The gray dots are cCEs with no detectable activity in the indicated cell types. Representative simple TF motifs that are constituents of cell type-specific CEs are shown in the left panel.
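The comparison described in panel (C) can be illustrated with a minimal Python sketch. This is not the authors' DESeq2 analysis: it substitutes a plain difference of log2(RNA/DNA) ratios for the DESeq2 model, and the function names, the dictionary layout, the pseudocount, and the choice to apply the barcode filter to both test and control sequences are all assumptions made for illustration. Only the ≥30 unique-barcode threshold and the test-versus-matched-control design come from the legend.

```python
import math

# Threshold stated in panel (A): sequences need at least 30 unique barcodes.
MIN_BARCODES = 30


def log2fc(rna, dna, pseudocount=1.0):
    """RNA/DNA log2 fold change; the pseudocount is an assumption to avoid log(0)."""
    return math.log2((rna + pseudocount) / (dna + pseudocount))


def motif_activity(test, control, min_barcodes=MIN_BARCODES):
    """Simplified motif activity: test log2FC minus matched-control log2FC.

    `test` and `control` are hypothetical records with keys 'rna', 'dna'
    (aggregated barcode counts) and 'n_barcodes' (unique barcodes).
    Returns None when either sequence fails the barcode filter
    (filtering the control as well is an assumption of this sketch).
    """
    if test["n_barcodes"] < min_barcodes or control["n_barcodes"] < min_barcodes:
        return None
    return log2fc(test["rna"], test["dna"]) - log2fc(control["rna"], control["dna"])


# Toy counts, invented for illustration only.
motif_seq = {"rna": 799, "dna": 99, "n_barcodes": 45}
mutated_ctrl = {"rna": 199, "dna": 99, "n_barcodes": 38}
low_coverage = {"rna": 50, "dna": 10, "n_barcodes": 12}

activity = motif_activity(motif_seq, mutated_ctrl)
filtered = motif_activity(low_coverage, mutated_ctrl)
```

The same function covers the composite case by changing the inputs: passing a CE-containing sequence as `test` and its double mutant as `control` gives a simplified CE activity, and passing a single-motif mutant as `test` against the double mutant mirrors the individual-contribution comparison in the legend. A real analysis would work on replicate-level counts so that a statistical model can estimate dispersion and significance.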
Article Snippet: These cells were transfected with a Neon transfection system (Thermo Fisher Scientific, MPK5000) using a kit (MPK10096B) with 3 pulses of 1200 V for 20 ms. A total of five biological replicates were performed, each using cells at a density of ∼1 million cells/ml for the transfections.
Techniques: Transfection, Sequencing, Activity Assay, Control, Mutagenesis, Functional Assay